Abstract

For graphs F and G an F-matching in G is a subgraph of G consisting of pairwise vertex disjoint copies
of F. The number of F-matchings in G is denoted by s(F, G). We show that for every fixed positive integer
m and every fixed tree F, the probability that $s(F, T_n) \equiv 0 \pmod{m}$, where $T_n$ is a random labeled tree
with n vertices, tends to one exponentially fast as n grows to infinity. A similar result is proven for induced
F-matchings. This generalizes a recent result of Wagner who showed that the number of independent sets 
in a random labeled tree is almost surely a zero residue. 

1 Introduction 

The number of independent sets in graphs is an important counting parameter. It is particularly well-studied 
for trees and tree-like structures. Prodinger and Tichy showed in [9] that the star and the path maximize and 
minimize, respectively, the number of independent sets among all trees of a given size. Part of the interest in 
this graph invariant stems from the fact that the number of independent sets plays a role in statistical physics as 
well as in mathematical chemistry, where it is known as the Merrifield-Simmons index [5]. A problem that arises
in this context is the inverse problem: determine a graph within a given class of graphs (such as the class of all 
trees) with a given number of independent sets. It is an open conjecture [6] (see also [5]) that all but finitely 
many positive integers can be represented as the number of independent sets of some tree. Recently Wagner 
[11] published a surprising result that may partially explain why the inverse problem for independent sets in 
trees is difficult. He showed that for every positive integer m, the number of independent sets in a random tree 
with n vertices is zero modulo m with probability exponentially close to one. Wagner's proof does not give an 
intuitive explanation of the aforementioned fact. In this paper we give a probabilistic proof for Wagner's result. 
Our proof is intuitive and simple, thus allowing us to generalize the result significantly. We refer the reader to 
[11] for further motivation and for a recent survey of previous results regarding the number of independent sets
in trees. 

Another graph parameter popular in statistical physics and in mathematical chemistry is the Hosoya index 
which is the number of matchings in the graph. While the inverse problem for the number of matchings in 
trees is easy, as the star with n vertices has exactly n matchings, finding the distribution of this number is still 
open, as is the case with the number of independent sets. Wagner mentions in [11] that his method could be
applied to the number of matchings as well, showing that asymptotically this number is typically divisible by 
any constant m. This may serve as an explanation for the hardness of obtaining distribution results. 

Both independent sets and matchings are special cases of F-matchings. Let F and G be graphs. An
F-matching in G is a subgraph of G consisting of pairwise vertex disjoint copies of F. We say that the F-matching
is induced in G if no additional edge of G is spanned by the vertices of G covered by the matching. These
two closely related notions naturally generalize matchings and independent sets. Indeed, if F is the graph with
two vertices and one edge then an F-matching is simply a matching. If F is a single vertex then an induced
F-matching is an independent set.

Given graphs F and G we denote the set of F-matchings in G by S(F, G) and its size by s(F, G). The set 
of all induced F-matchings in G is denoted by S'(F, G), with $s'(F, G) = |S'(F, G)|$ being its size.
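To make the two special cases concrete, the following brute-force sketch counts matchings (F a single edge) and independent sets (induced F-matchings with F a single vertex) in a small graph given as an edge list. The function names are ours, and the enumeration is exponential, so it is meant only as a sanity check on tiny examples.

```python
from itertools import combinations

def count_matchings(edges):
    """Number of sets of pairwise vertex disjoint edges (the empty set included)."""
    edges = list(edges)
    total = 0
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            covered = [v for e in subset for v in e]
            if len(covered) == len(set(covered)):   # no vertex used twice
                total += 1
    return total

def count_independent_sets(vertices, edges):
    """Number of vertex subsets spanning no edge (the empty set included)."""
    vertices = list(vertices)
    total = 0
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            if not any(u in chosen and v in chosen for u, v in edges):
                total += 1
    return total

star = [(0, 1), (0, 2), (0, 3), (0, 4)]           # the star with 5 vertices
print(count_matchings(star))                       # 5
print(count_independent_sets(range(5), star))      # 17
```

The first value matches the remark in the introduction that the star with n vertices has exactly n matchings.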

In this paper G will be drawn at random from a probability space of graphs. We define the random tree $T_n$
to be the set of all $n^{n-2}$ labeled trees on n vertices endowed with the uniform distribution.
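A uniform sample of the random labeled tree can be generated, for instance, through Prüfer sequences, which are in bijection with the labeled trees on [n]. This is a standard device and not the method used in this paper, which relies on the Joyal mapping instead. A minimal sketch, with function names of our choosing:

```python
import random
from itertools import product

def prufer_to_tree(seq, n):
    """Decode a Prüfer sequence (length n-2, entries in 1..n) into n-1 edges."""
    degree = [1] * (n + 1)                 # index 0 is unused
    for x in seq:
        degree[x] += 1
    edges = []
    for x in seq:
        leaf = min(v for v in range(1, n + 1) if degree[v] == 1)
        edges.append((leaf, x))
        degree[leaf] -= 1
        degree[x] -= 1
    u, v = [w for w in range(1, n + 1) if degree[w] == 1]
    edges.append((u, v))
    return edges

def random_labeled_tree(n, rng=None):
    """Uniform random labeled tree on {1, ..., n}, for n >= 2."""
    rng = rng or random.Random()
    return prufer_to_tree([rng.randint(1, n) for _ in range(n - 2)], n)

# Cayley's formula for n = 4: the 4^2 sequences decode to 16 distinct trees.
trees = {frozenset(frozenset(e) for e in prufer_to_tree(s, 4))
         for s in product(range(1, 5), repeat=2)}
print(len(trees))                          # 16
```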

Our main results are the following: 

Theorem 1. Let F be a tree that is not a single vertex and let m be a positive integer. Then there is a constant
$c = c(F, m) > 0$ such that the number of F-matchings in the random tree $T_n$ is zero modulo m with probability
at least $1 - e^{-cn}$.

Note that when F is a single vertex, the number of F-matchings in any graph with n vertices is $2^n$.

Theorem 2. Let F be a tree and let m be a positive integer. Then there is a constant $c' = c'(F, m) > 0$
such that the number of induced F-matchings in the random tree $T_n$ is zero modulo m with probability at least
$1 - e^{-c'n}$.

Wagner's result is an immediate consequence of Theorem 2: simply take F to be a single vertex.

In the next section we prove Theorem 1, in Section 3 we describe a similar proof of the induced case, and in
the last section we state some extensions and conclude with a few remarks and open questions. Our extensions
include the fact that the assertions of both theorems hold when the random tree $T_n$ is replaced by a random
planar graph on n vertices.

2 The non-induced case 

In this section we prove Theorem 1. The proof is probabilistic and has two parts, a probabilistic claim (Lemma
3) and a deterministic claim (Lemma 4). Theorem 1 is an immediate consequence of these claims.

We shall use the following notation. Let T be a tree and assume that {u,v} is an edge in T. We define
a rooted tree $T^{(u,v)}$ by first setting v as the root (this defines a direction of parenthood in T) and then
removing u along with its descendants. Note that $T^{(u,v)}$ is a rooted (undirected) tree. If R is a rooted tree
isomorphic to $T^{(u,v)}$ (a fact we denote by $R \cong T^{(u,v)}$) for some edge $\{u,v\} \in T$, we say that T has an
R-leaf. The next lemma states that for every fixed rooted tree R, a random tree has an R-leaf with probability
exponentially close to 1.
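The notion of an R-leaf can be tested mechanically on small trees. The sketch below is an illustration of ours, not part of the paper: it encodes each rooted tree by a canonical parenthesis string, in which the children's encodings are sorted and concatenated, so that two rooted trees are isomorphic exactly when their strings coincide. It then scans all ordered pairs (u, v) with {u, v} an edge.

```python
def canon(adj, v, parent):
    """Canonical string of the tree hanging at v once the side of `parent` is removed."""
    children = sorted(canon(adj, w, v) for w in adj[v] if w != parent)
    return "(" + "".join(children) + ")"

def has_r_leaf(edges, r_canon):
    """True if some edge {u, v} has T^(u,v) isomorphic to the rooted tree
    whose canonical string is r_canon."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return any(canon(adj, v, u) == r_canon for u in adj for v in adj[u])

path = [(1, 2), (2, 3), (3, 4)]
star = [(0, 1), (0, 2), (0, 3)]
cherry = "(()())"                  # a root with two leaf children
print(has_r_leaf(path, cherry))    # False
print(has_r_leaf(star, cherry))    # True
```

In the star, removing the side of one leaf leaves the center with its two other leaves, which is the rooted tree `cherry`. Every $T^{(u,v)}$ of a path is again a rooted path, so no such edge exists there.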

Lemma 3. Let R be a rooted tree. There exists a constant $c = c(R) > 0$ such that

$$\Pr\left[\exists\, \{u,v\} \in T_n \text{ s.t. } R \cong T_n^{(u,v)}\right] > 1 - e^{-cn}.$$

Proof. While our objects of interest are trees, it is easier to work with functions on $[n] = \{1, 2, \ldots, n\}$ via the
Joyal mapping ([JJ, also presented in English in [1]).

We shall briefly describe the Joyal mapping and some of its properties that we need. The Joyal mapping
maps f, a function from [n] to itself, to an undirected tree $T_f$ over the set of vertices [n]. There are $n^n$ functions
in $[n]^{[n]}$, but only $n^{n-2}$ labeled trees over [n]. In order to make the mapping into a bijection we distinguish two
vertices of a labeled tree by marking them left and right (we may mark one vertex with both). Now the target
set is the set of all labeled trees over [n] together with the markings, and is of size $n^n$.

The mapping is defined as follows. Let $f: [n] \to [n]$. Define $G_f$ as the functional digraph^1 with vertex set [n]
and edge set $\{(i, f(i)) : i \in [n]\}$. Every vertex in $G_f$ has outdegree one, so every connected component has one
directed cycle, and all edges that are not in a cycle point towards the cycle. Let $M = \{a_1 < a_2 < \cdots < a_m\}$
be the set of all vertices participating in a cycle of $G_f$. Notice that M is the maximal set such that $f|_M$
is a bijection. To get $T_f$, the tree corresponding to the function f, we first define a path by taking the vertices
of M and adding the m - 1 edges of the form $\{f(a_j), f(a_{j+1})\}$. We then mark $f(a_1)$ as "left" and $f(a_m)$ as
"right". Finally we add the vertices in $[n] \setminus M$ with the edges $\{i, f(i)\}$ from $G_f$ (forgetting about directions).

1 A functional digraph is a directed graph with all outdegrees equal to one.
Given a tree T with two such markings, we go back by defining M as the set of vertices in the path P connecting
"left" and "right", and directing all other vertices towards P. Sort the members of M according to their value
and denote them by $a_1 < a_2 < \cdots < a_m$. We define f as follows. If $i \in M$ is the j'th vertex in the path then
$f(a_j) = i$. If $i \notin M$ then there is exactly one edge emanating from i, say towards j, and we set $f(i) = j$.
It is easy to verify that this is indeed the inverse of the mapping described above.
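The two directions of the correspondence can be sketched in a few lines and checked exhaustively for small n. The code below is an illustration under our own naming: functions are stored as dictionaries on {1, ..., n}, and a marked tree is a triple (edge set, left, right). In the inverse direction the j'th vertex of the path from "left" to "right" becomes the image of the j'th smallest path vertex, which is what the forward construction forces.

```python
from itertools import product

def cyclic_vertices(f, n):
    """Sorted list of the vertices lying on a cycle of the functional digraph of f."""
    current = set(range(1, n + 1))
    while True:
        image = {f[i] for i in current}
        if image == current:               # f restricted to `current` is a bijection
            return sorted(current)
        current = image

def joyal_tree(f, n):
    """Forward map: f -> (edge set of the tree, left, right)."""
    cyc = cyclic_vertices(f, n)
    on_cycle = set(cyc)
    path = [f[a] for a in cyc]             # f(a_1), ..., f(a_m)
    edges = {frozenset(e) for e in zip(path, path[1:])}
    edges |= {frozenset((i, f[i])) for i in range(1, n + 1) if i not in on_cycle}
    return frozenset(edges), path[0], path[-1]

def joyal_function(edges, left, right, n):
    """Inverse map: (edge set, left, right) -> f."""
    adj = {v: set() for v in range(1, n + 1)}
    for e in edges:
        u, v = tuple(e)
        adj[u].add(v)
        adj[v].add(u)
    parent, stack = {right: None}, [right]  # root the tree at `right`
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                stack.append(w)
    path = [left]
    while path[-1] != right:
        path.append(parent[path[-1]])
    f = dict(zip(sorted(path), path))       # f(a_j) = j'th vertex of the path
    for i in range(1, n + 1):
        if i not in f:
            f[i] = parent[i]                # the edge at i pointing towards the path
    return f

n = 4
images = set()
for values in product(range(1, n + 1), repeat=n):
    f = dict(zip(range(1, n + 1), values))
    marked_tree = joyal_tree(f, n)
    assert joyal_function(*marked_tree, n) == f   # round trip
    images.add(marked_tree)
print(len(images) == n ** n)                # True
```

The round trip shows injectivity, and the count of images confirms that all $n^n$ marked trees are reached for n = 4.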

Notice that vertices that are not in a cycle are left by the Joyal mapping as they were in $G_f$, meaning that
they will be incident with exactly the same edges as in the functional graph. In particular, edges with both
endpoints being vertices that are not in a cycle of $G_f$ will touch the same edges in $T_f$ as in $G_f$. For our purpose,
the fate of vertices lying in a cycle is irrelevant.

Direct the edges of R towards the root; we keep denoting the resulting directed tree by R. Consider a random
function f on [n] and let X be the
random variable counting the number of directed edges (u, v) in $G_f$ such that u, v and the ancestors of v in $G_f$
do not belong to any cycle in $G_f$, and in addition, v and its ancestors form an isomorphic copy of R.

Denote the vertices of R by $r_1, \ldots, r_k$, the root being $r_k$. Fix a (k+1)-tuple of vertices of $G_f$, say $1, 2, \ldots, k+1$.
The probability that the edge (k, k+1) meets the condition described above is at least the probability that
$(k, k+1) \in E(G_f)$, the mapping $i \mapsto r_i$ is an isomorphism between R and $G_f[\{1, \ldots, k\}]$, and in addition, there
are no other edges of $G_f$ incoming to $\{1, \ldots, k+1\}$. The latter is

$$\left(\frac{1}{n}\right)^{k} \left(\frac{n-k-1}{n}\right)^{n-k}.$$

In order to see this simply notice that for $1 \le i \le k$ there is only one valid target for f(i), while for $i \ge k+1$ it
is enough to require that f will map i outside of $\{1, 2, \ldots, k+1\}$. Therefore we get

$$E[X] \ge \binom{n}{k+1} \left(\frac{1}{n}\right)^{k} \left(\frac{n-k-1}{n}\right)^{n-k},$$

which implies $E[X] = \Omega(n)$.
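The linear growth of the expectation can be made explicit. The computation below is a sketch under the assumption that the lower bound is obtained by summing the probability above over all (k+1)-subsets of [n], with one fixed labeling per subset, so that distinct subsets account for distinct edges:

```latex
\binom{n}{k+1}\left(\frac{1}{n}\right)^{k}\left(\frac{n-k-1}{n}\right)^{n-k}
  = \frac{n(n-1)\cdots(n-k)}{(k+1)!\,n^{k}}\left(1-\frac{k+1}{n}\right)^{n-k}
  \sim \frac{e^{-(k+1)}}{(k+1)!}\,n
  \qquad (n \to \infty).
```

Since k depends only on R, the leading coefficient is a positive constant.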

We want to show that X is concentrated around its mean. Consider the value exposing martingale, in which
we expose the values of f one by one. Now, changing the value of f in one coordinate, i, can ruin at most
two copies of R (one using the edge (i, f(i)) and another that now has an extra edge (i, f'(i))). Therefore
the Lipschitz condition with constant two holds and we can apply the Azuma Inequality [3J [3], which yields
$\Pr[X = 0] < e^{-cn}$ for some constant c > 0.
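For concreteness, here is how the constant can be tracked, using one common form of the Azuma Inequality for a martingale with n steps and Lipschitz constant 2. The symbol a, denoting a positive constant with $E[X] \ge an$ for all large n, is our notation:

```latex
\Pr[X = 0] \;\le\; \Pr\bigl[X \le E[X] - t\bigr]
  \;\le\; \exp\!\left(-\frac{t^{2}}{2 \cdot n \cdot 2^{2}}\right)
  \quad\text{with } t = E[X] \ge a n
  \;\;\Longrightarrow\;\;
  \Pr[X = 0] \;\le\; \exp\!\left(-\frac{a^{2} n}{8}\right).
```

Any c smaller than $a^2/8$ then works for all sufficiently large n.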

Observe that if X(f) > 0 then by the definition of X, the corresponding tree $T_f$ contains the edge {u,v}
required by the lemma.

As mentioned above, the Joyal correspondence is $n^2$ to one. If a labeled tree T does not contain an edge
as required, then all its $n^2$ preimages f satisfy X(f) = 0. Therefore, the probability not to get a tree with a
required edge is at most $\Pr[X = 0] < e^{-cn}$ as proven above. □

The next part of the proof states the existence of a nullifying tree Z (depending on F and m) such
that if a tree T has a Z-leaf then $s(F, T) \equiv 0 \pmod{m}$.

Lemma 4. Let F be a tree with at least one edge and let m be an integer. Then there exists a rooted tree Z
such that, if $Z \cong T^{(u,v)}$ for some edge $\{u,v\} \in T$, then $s(F, T) \equiv 0 \pmod{m}$.

Proof. The proof is constructive. By Proposition 5, to be proven below, there exists a tree Y such that
$s(F, Y) \equiv 0 \pmod{m}$.

Let $\Delta(F)$ be the maximal degree of F. To get Z take $\Delta(F) + 1$ copies of Y, add a new vertex r to be viewed
as the root of Z, and connect r to a vertex of each Y (thus adding $\Delta(F) + 1$ edges).

Let T be a tree and assume that $Z \cong T^{(u,v)}$ for some edge $\{u,v\} \in T$. We wish to show that
$s(F, T) \equiv 0 \pmod{m}$. There are finitely many ways in which one may cover v by a copy of F, and it may also be that
v remains uncovered. We classify F-matchings in T into classes $C_1, C_2, \ldots, C_q$ according to the copy of F
covering v, with the set of F-matchings not covering v being a separate class $C_0$. We argue that the number
of F-matchings in each such class is a zero residue. Indeed, the number of F-matchings in a given class $C_i$ is
precisely the number of F-matchings in the forest remaining from T after removing v and the copy covering it, if
there is one. In fact, this number is the product of the numbers of F-matchings in the connected components of
the forest. A copy of F covering v uses at most $\Delta(F)$ of the edges at v, so it meets at most $\Delta(F)$ of the
$\Delta(F) + 1$ copies of Y. Hence, by our construction of Z, at least one of the trees in this forest is isomorphic to Y.
Since $s(F, Y) \equiv 0 \pmod{m}$ we deduce that the number of F-matchings in the forest, and also in $C_i$, is zero modulo m. This is
true for all i, and since $S(F, T) = \bigcup_i C_i$ one has $s(F, T) \equiv 0 \pmod{m}$. □
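The smallest instance of this construction is easy to check by machine. Take F to be a single edge and m = 2. The tree Y = K_2 has two matchings, an even number, and $\Delta(F) = 1$, so Z consists of a root joined to one endpoint of each of two disjoint edges. The sketch below uses helper names and a labeling convention of our own: it hangs Z below a vertex of an arbitrary base tree and counts matchings by branching on one edge at a time.

```python
def count_matchings(edges):
    """Count matchings by branching on the first edge: leave it out, or use it."""
    if not edges:
        return 1
    (u, v), rest = edges[0], edges[1:]
    without_edge = count_matchings(rest)
    with_edge = count_matchings([e for e in rest if u not in e and v not in e])
    return without_edge + with_edge

def attach_nullifier(edges, u, fresh):
    """Hang Z below u: a new root v joined to one endpoint of each of two
    new copies of K_2. New vertices are labeled fresh, ..., fresh + 4."""
    v, a1, b1, a2, b2 = range(fresh, fresh + 5)
    return list(edges) + [(u, v), (v, a1), (a1, b1), (v, a2), (a2, b2)]

print(count_matchings(attach_nullifier([], 0, 1)))           # 12
base = [(0, 1), (1, 2), (2, 3)]                              # a path on 4 vertices
print(count_matchings(attach_nullifier(base, 2, 4)) % 2)     # 0
```

The value 12 agrees with the classification in the proof: 4 matchings leave v uncovered, 4 match v to the base vertex, and 2 match v into each copy of Y.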

Before stating and proving the next proposition we define some notation. Let F be a tree. Take a longest
path in F and denote its vertices by $u_1, u_2, \ldots, u_{l+1}$, where l is the diameter of F. If we disconnect all edges of
the form $\{u_i, u_{i+1}\}$ we get l + 1 subtrees. Let $b_i$ be the number of vertices in the subtree containing $u_i$. With
this notation we have $|F| = \sum_{i=1}^{l+1} b_i$. Since $b_{l+1} = 1$ we may also write $|F| = 1 + \sum_{i=1}^{l} b_i$. We shall use this
notation in the proof of the next proposition and in the proof of Proposition 8 as well.

Proposition 5. Let F be a tree with at least one edge and let m be an integer. Then there exists a rooted tree
Y such that $s(F, Y) \equiv 0 \pmod{m}$.

Proof. Let $W_t$ be a tree made of t copies of F in which we identify the vertex $u_{l+1}$ of copy i with the vertex $u_1$
of copy i + 1 (for $1 \le i \le t - 1$). Let $P \subset W_t$ be the path in $W_t$ connecting the first copy of $u_1$ to the last copy
of $u_{l+1}$, and number its vertices by $1, \ldots, lt + 1$ in the natural order, from the copy of $u_1$ in the first copy of F
to the copy of $u_{l+1}$ in the last copy of F. We want to have a direction of parenthood in $W_t$, so we set 1 to be
the root. Notice that all connected components of $W_t \setminus V[P]$ are of size strictly less than |F|.

We are interested in embeddings of F in $W_t$, that is, in subgraphs of $W_t$ that are isomorphic to F. Notice
that every such embedding must have a vertex in P. Let C be an embedding of F in $W_t$. We call the vertex
$\min\{C \cap P\}$ the starting vertex of C. Consider the set of all starting vertices in $W_t$. If $1 \le i \le (t-2)l + 1$
is a starting vertex, then by symmetry so is i + l. Observe that trivially 1 is a starting vertex (and so are
$l + 1, 2l + 1, \ldots$). By the symmetry argument above, if there are d starting vertices between 1 and l + 1
(inclusive), then there are $1 + (t-1)(d-1)$ starting vertices in $W_t$. To see this recall that 1 is always a starting
vertex, and each copy but the last adds d - 1 starting vertices; also, the last copy of F in $W_t$ does not contain
any starting vertices apart from $1 + l(t-1)$, as deleting $1 + l(t-1)$ leaves fewer than |F| vertices to the right of
it. Similarly, if i is a starting vertex then there are d starting vertices between i and i + l, inclusive.

Now we can define $\{Y_r\}$, a family of subtrees of $W_t$, a member of which will eventually be the sought-after
tree. Set t to be large enough ($t = 1 + \lceil (r-1)/(d-1) \rceil$ will do). To get $Y_r$ take the minimal subpath of
$P \subset W_t$ containing the last r starting vertices and then append to each vertex in the subpath the subtree of its
descendants through children outside P. For example, $Y_1$ is the single starting vertex $1 + l(t-1)$ and $Y_d$ is the
next to the last copy of F in $W_t$.

Let g(r) be the number of F-matchings in $Y_r$. We count such F-matchings by the membership of i, the first
vertex in $Y_r$. If i is not covered by the matching, then the next embedding of F begins no earlier than the next
starting vertex. This means that the number of F-matchings of $Y_r$ in which i is not covered is g(r - 1).

We argue now that if i is covered by the matching then the next d - 1 starting vertices are also covered.
Let $\psi: F \to Y_r$ be an embedding covering i. We claim that the next d - 1 starting vertices are also covered
by $\psi$. First, since the diameter of F is l, no vertex of P farther than i + l (which is the starting vertex d - 1
away from i) is covered by $\psi$. On the other hand, the path from i to i + l - 1 contains one copy of each $u_j$
(not necessarily in the natural order). Thus, the number of vertices in the set containing $i, i+1, \ldots, i+l-1$
and their descendants is exactly $\sum_{j=1}^{l} b_j = |F| - 1$, hence $\psi$ extends also to i + l. Therefore, the other embeddings in the
F-matching need to start after i + l. We get that the number of such matchings is exactly g(r - d). This gives
the recursion g(r) = g(r - 1) + g(r - d).

Observe that the tree $Y_r$, $1 \le r < d$, does not contain a copy of F, and thus the only F-matching in $Y_r$ is
the empty one, implying g(r) = 1 for every $1 \le r < d$; also, g(d) = 2 as $Y_d = F$. We can extend the recursion
backwards by defining g(0) = 1 and g(-1) = 0. By Claim 6 below, applied to the shifted sequence $r \mapsto g(r - 1)$,
there is an integer $r_0 > 0$ such that $g(r_0) \equiv 0 \pmod{m}$. Define $Y = Y_{r_0}$. By the definition of g(r) we have
$s(F, Y) \equiv 0 \pmod{m}$. □
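As a worked instance (our illustration, not taken from the paper), let F be a single edge. Then $W_t$ is a path with t edges, every vertex of the path except the last is a starting vertex, and the quantities of the proof specialize as follows:

```latex
F = K_2:\qquad l = 1,\quad d = 2,\quad Y_r \cong P_r \ (\text{the path on } r \text{ vertices}),
\qquad g(r) = g(r-1) + g(r-2),
\\[4pt]
g(-1) = 0,\ \ g(0) = 1,\ \ g(1) = 1,\ \ g(2) = 2,\ \ g(3) = 3,\ \ g(4) = 5,\ \ g(5) = 8 .
```

These are the Fibonacci numbers, in agreement with the known count of matchings in a path. For m = 2 the first zero residue occurs at r = 2, giving $Y = Y_2 = K_2$.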

Claim 6. Let $g: \mathbb{N} \to \mathbb{Z}$ be a sequence of integers obeying a recurrence relation with integer coefficients
$g(r) = \sum_{i=1}^{d} c_i g(r - i)$. Assume that g(0) = 0 and $c_d = 1$. Then for every positive integer m there exists
an index $r_0 = r_0(m) > 0$ such that $g(r_0) \equiv 0 \pmod{m}$.

Proof. First we claim that g(r) (mod m) is periodic. Indeed, since g(r) (mod m) is determined by the d-tuple
of the previous d values, and since modulo m there are at most $m^d$ possible d-tuples, then after at most
$m^d$ steps the sequence g(r) (mod m) must become periodic. Next we claim that g(r) (mod m) is periodic
from the beginning. To see this simply extend the sequence $m^d$ steps backwards using the recurrence relation
$g(r - d) = g(r) - \sum_{i=1}^{d-1} c_i g(r - i)$. The previous argument shows that the extended sequence is periodic starting
at most at the $m^d$'th element, which is the first element of the original sequence. Hence g(r) (mod m) is periodic
from its first element, g(0) = 0, and thus there is some $r_0 > 0$ such that $g(r_0) \equiv 0 \pmod{m}$. □
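The claim is effective: the index can be found by iterating the recurrence modulo m. The sketch below is ours, including the function name and the indexing convention. It treats the recurrence g(r) = g(r - 1) + g(r - d) from the proof of Proposition 5, with g(0) = ... = g(d - 1) = 1, so that the backward extension gives g(-1) = 0 and the periodicity argument bounds the search by $m^d$ steps.

```python
def first_zero_index(d, m):
    """Smallest r > 0 with g(r) ≡ 0 (mod m), where g(r) = g(r-1) + g(r-d)
    for r >= d and g(0) = ... = g(d-1) = 1."""
    if d < 2 or m < 1:
        raise ValueError("need d >= 2 and m >= 1")
    g = [1 % m] * d                        # residues of g(0), ..., g(d-1)
    for r in range(1, m ** d + d):
        if r >= d:
            g.append((g[r - 1] + g[r - d]) % m)
        if g[r] == 0:
            return r
    raise AssertionError("no zero residue found; this would contradict Claim 6")

print(first_zero_index(2, 4))   # 5
print(first_zero_index(3, 2))   # 3
```

With d = 2 the sequence is 1, 1, 2, 3, 5, 8, ..., and 8 is the first term divisible by 4.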

3 The induced case 

In this section we prove Theorem 2. The proof is similar to the proof of Theorem 1 and we shall focus on the
differences between the proofs. As before, the proof is probabilistic. Lemma 3 is the probabilistic part here as
well, but the deterministic part is replaced by Lemma 7 below.

We begin by constructing a nullifying rooted tree from copies of a tree Y' having $s'(F, Y') \equiv 0 \pmod{m}$.

Lemma 7. Let F be a tree and let m be an integer. There exists a rooted tree Z' such that if $Z' \cong T^{(u,v)}$ for
some edge $\{u, v\} \in T$, then $s'(F, T) \equiv 0 \pmod{m}$.

Proof. By Proposition 8 below there exists a tree Y' such that $s'(F, Y') \equiv 0 \pmod{m}$. Construct Z' by taking
$\Delta(F) + 2$ copies of Y', adding a new vertex r to be viewed as the root of Z', connecting one copy to r with a
new edge and connecting each of the remaining $\Delta(F) + 1$ copies to r via a path of length two.

Let T be a tree and assume that $Z' \cong T^{(u,v)}$ for some edge $\{u, v\} \in T$. We need to show that
$s'(F, T) \equiv 0 \pmod{m}$.

There are finitely many ways in which v may be covered by a copy of F, if it is covered at all. We classify
induced F-matchings according to the copy of F covering v. Denote these classes by $C_1, \ldots, C_k$ and let $C_0$ be
the class of all induced F-matchings of T in which v is left uncovered. Clearly $S'(F, T) = \bigcup_{i=0}^{k} C_i$. We claim
that $|C_i| \equiv 0 \pmod{m}$ for every $0 \le i \le k$.

Consider first the class $C_0$ of induced F-matchings in T that leave v uncovered. The number of such
matchings is the number of induced F-matchings in the forest remaining after deleting v. This forest has a component
isomorphic to Y', namely the copy of Y' that was connected to v by a single edge. The number of induced
F-matchings in $C_0$ is then the product of the numbers of induced F-matchings in the connected components of
the aforementioned forest, which is zero modulo m.

Consider now the class $C_i$ for i > 0. As before, there is a natural one-to-one correspondence between induced
F-matchings in T that belong to $C_i$ and induced F-matchings of the forest remaining after removing the copy
of F covering v and all neighbors of vertices in that copy. Since v is covered by the matching, all of its neighbors
that are not covered by the same copy of F must remain uncovered. Otherwise, an additional edge outside the
copies of F would be spanned. This means that in the above forest at least one of the $\Delta(F) + 1$ copies that
were connected to v by a path of length two will now remain as a connected component. Hence, the number of
induced F-matchings in $C_i$ is a zero residue.

Summing the sizes of the classes $C_i$ we get that $s'(F, T) \equiv 0 \pmod{m}$. □
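Wagner's parity statement gives the smallest instance of this lemma. Take F to be a single vertex and m = 2. A single vertex Y' has two independent sets, and $\Delta(F) = 0$, so Z' consists of a root r with one pendant vertex and one pendant path of length two. The following sketch uses helper names and a vertex numbering of our own: it attaches Z' to a base tree on vertices 0, ..., n - 1 and counts independent sets by the usual tree recursion.

```python
def count_independent_sets(n, edges):
    """Independent sets of a tree on vertices 0..n-1, by dynamic programming."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    def solve(v, parent):
        excluded, included = 1, 1          # counts with v outside / inside the set
        for w in adj[v]:
            if w != parent:
                w_excluded, w_included = solve(w, v)
                excluded *= w_excluded + w_included
                included *= w_excluded
        return excluded, included

    return sum(solve(0, -1))

def attach_induced_nullifier(n, edges, u):
    """Hang Z' below u: root v = n, a leaf y1 at v, and a path v - w - y2."""
    v, y1, w, y2 = n, n + 1, n + 2, n + 3
    return n + 4, list(edges) + [(u, v), (v, y1), (v, w), (w, y2)]

print(count_independent_sets(*attach_induced_nullifier(1, [], 0)))        # 14
star = [(0, 1), (0, 2), (0, 3)]
print(count_independent_sets(*attach_induced_nullifier(4, star, 2)) % 2)  # 0
```

In the first example 12 independent sets avoid v and 2 contain it, so both classes of the proof have even size.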

Proposition 8. Let F be a tree and let m be an integer. Then there exists a rooted tree Y' such that
$s'(F, Y') \equiv 0 \pmod{m}$.

Proof. The construction and the proof are similar to those in the proof of Proposition 5 and we shall use the
notation defined just before it. We define $W'_t$ as a collection of t disjoint copies of F, and we add an edge
between the vertex $u_{l+1}$ of the i'th copy and the vertex $u_1$ of the (i + 1)'th copy. We think of the first copy of
$u_1$ as the root of $W'_t$.

Let P' be the path connecting the first copy of $u_1$ with the last copy of $u_{l+1}$ and denote its vertices by
$1, \ldots, t(l + 1)$ in the natural order. We define starting vertices in the same manner as in the proof of Proposition 5.
The symmetry argument still holds, only now the period is l + 1, that is, if $1 \le i \le (t-2)(l+1) + 1$ is a starting
vertex then so is i + l + 1. Also, if there are d starting vertices between 1 and l + 1, then there are d starting
vertices between every starting vertex i and i + l, and all in all there are (t - 1)d + 1 starting vertices in $W'_t$.

Let $Y'_r$ be the subgraph of $W'_t$ composed of the minimal subpath of P' containing the last r starting vertices
together with their descendants through vertices that are not in P'. Hence, $Y'_1$ is a single vertex and $Y'_{d+1}$ is a
copy of F with an extra vertex connected to $u_{l+1}$. Finally we define g'(r) as the number of induced F-matchings
in $Y'_r$.

We wish to derive a recurrence formula for g'(r). We count induced F-matchings of $Y'_r$ by the membership
of the first vertex. The number of induced F-matchings that do not cover the first vertex (which is also the first
starting vertex) is exactly g'(r - 1).

Consider matchings in which the first starting vertex i is covered. The embedding of F covering i cannot
cover vertices of P' farther than i + l, since the diameter of F is l. On the other hand, the number of vertices in
the subgraph made of the path connecting i to i + l together with their descendants that are not in P' is exactly
$\sum_{j=1}^{l+1} b_j = |F|$. Hence i + l is also covered by the same embedding that covers i. Now, if i + l + 1 is covered by
another embedding of F, then the edge {i + l, i + l + 1} is spanned, which is forbidden, so i + l + 1 is not covered. Since
there are d starting vertices between i and i + l, and since i + l + 1 is a starting vertex as well, we get that the
number of such matchings is exactly g'(r - d - 1). Therefore we have g'(r) = g'(r - 1) + g'(r - d - 1).

Clearly g'(r) = 1 for every $1 \le r \le d - 1$, as the number of vertices in $Y'_r$ in these cases is smaller than |F|.
The value of g'(d) may be either 1 or 2, depending on whether F may be embedded into $Y'_d$ or not. The value
of g'(d + 1) can also be one of a few options. Still, we extend g' backwards by defining g'(0) = g'(d + 1) - g'(d),
g'(-1) = g'(d) - g'(d - 1), and g'(-2) = g'(d - 1) - g'(d - 2) = 0. We complete the proof by applying Claim 6. □



4 Concluding discussion 

Our initial objective was to provide a simple and intuitive explanation for the fact that almost all labeled trees
have an even number of independent sets. Indeed, there are nullifying trees Z such that when a tree T has a Z-leaf,
the number of independent sets in T is even. Also, every fixed tree Z appears as a Z-leaf in a random tree with 
n vertices with probability tending to one as n goes to infinity. Therefore almost all trees have an even number 
of independent sets. 

The simplicity of the explanation allowed vast generalizations, namely Theorems 1 and 2 above. In fact, the proof
also works in other scenarios. If a probability space of graphs has a property corresponding to the probabilistic
part of the proof, then the number of (induced) F-matchings will be a zero residue in that probability space as
well.

As a concrete example, let $P_n$ be the random planar graph of order n, that is, $P_n$ is the set of all simple
labeled planar graphs with n vertices endowed with the uniform distribution. In [7] it is shown that with
probability exponentially close to one, $P_n$ has an R-leaf for every fixed rooted tree R. Thus, by the above, the
number of (induced) F-matchings is a zero residue in a random planar graph. Notice that $P_n$ is connected with
probability at least 1/e as shown in [7], so a potentially simpler strategy of proving the same result, namely showing
the existence of a component having a zero residue number of (induced) F-matchings, will not suffice.

Similar results may be obtained for other random graphs models as well. On the other hand, if we consider 
dense random graphs then a different approach is required. For example, it is not clear how the number of 
independent sets typically behaves as a residue for the binomial random graph G(n, 1/2). Moreover, it is not 
difficult to show that for p = p(n) close to 1 in the range in which the size of the maximum independent set of G(n,p) is
O(1) > 1 asymptotically almost surely, the number of independent sets in G(n,p) is nearly uniformly distributed
modulo any constant m. See [10] for several related results. 

Our proof implies that the number of F-matchings in a random tree of order n is typically zero modulo any 
constant m when the size of F grows slowly enough with n. It may be interesting to find the maximal rate of 
growth for which this property still holds. 

Acknowledgment. We thank Alan Frieze for suggesting the use of the Joyal Correspondence in the proof of
Lemma 3.